Abstract: An ensemble of LDPC convolutional codes with
parity-check matrices composed of permutation matrices is
considered. The convergence of the iterative belief propagation
based decoder for terminated convolutional codes in the ensemble
is analyzed for binary-input output-symmetric memoryless
channels using density evolution techniques. We observe that the
structured irregularity in the Tanner graph of the codes leads to
significantly better thresholds when compared to corresponding
LDPC block codes.

I. Introduction 

Low-density parity-check (LDPC) block codes, invented by
Gallager [1], have been shown to achieve excellent performance
on a wide class of channels. The convolutional counterparts
of LDPC block codes, LDPC convolutional codes, have
been described in [2] [3] [4]. Both LDPC block and convolutional
codes are defined by sparse parity-check matrices and
can be decoded iteratively using message passing algorithms
(e.g., belief propagation) with complexity per bit per iteration
independent of the block length or constraint length. This
makes iterative decoding of LDPC codes with large block
length or constraint length feasible.

In [5], the existence of a sequence of (J, K) regular^1
LDPC convolutional codes for which an arbitrary number of
independent iterations is possible was demonstrated. Based on
this result, it follows that the threshold of (J, K) regular LDPC
convolutional codes is at least as good as the threshold of
(J, K) regular LDPC block codes for any message passing
algorithm and channel. Moreover, simulation results on the
additive white Gaussian noise (AWGN) channel (see [3] [4])
indicate that LDPC convolutional codes may have better
thresholds than corresponding LDPC block codes.

In this paper we consider a class of regular LDPC convolutional
codes with parity-check matrices composed of blocks
of randomly constructed M × M permutation matrices. For
the erasure channel, iterative belief propagation decoding
of terminated LDPC convolutional codes in this class was
analyzed in [6]. There it was shown that the termination leads
to a structured irregularity in the Tanner graph, and that this
structured irregularity leads to significantly better thresholds
compared to corresponding randomly constructed regular and
irregular LDPC block codes. Further, it was observed that the
thresholds approach the capacity of the erasure channel. In this
paper we generalize the techniques of [6] to arbitrary binary-input
memoryless channels and give numerical examples for
the AWGN channel.

^1 (J, K) regular LDPC codes are defined by parity-check matrices having
J ones in each column of the matrix and K ones in each row of the matrix.

Fig. 1. Syndrome former matrix for a code in $\mathcal{C}_P(3, 6, M)$.

II. Convolutional Code Ensemble 

A rate R = b/c binary convolutional code can be defined
as the set of sequences $v = (\dots, v_{-1}, v_0, v_1, \dots)$,
$v_t \in \mathbb{F}_2^c$, satisfying the equality $vH^T = 0$, where
the infinite syndrome former matrix $H^T$ is given by

$$
H^T = \begin{pmatrix}
\ddots & & \ddots & & & \\
& H_0^T(0) & \cdots & H_{m_s}^T(m_s) & & \\
& & \ddots & & \ddots & \\
& & H_0^T(t) & \cdots & H_{m_s}^T(t+m_s) & \\
& & & \ddots & & \ddots
\end{pmatrix}
$$

and each $H_i^T(t+i)$ is a $c \times (c-b)$ binary matrix. If $H^T$
defines a rate R = b/c convolutional code, the matrix $H_0^T(t)$
must have full rank for all time instants t. In this case, by
suitable row permutations we can ensure that the last $(c-b)$
rows are linearly independent. Then the first b symbols at
each time instant are information symbols and the last $(c-b)$
symbols the corresponding parity symbols. The largest i such
that $H_i^T(t+i)$ is a nonzero matrix for some t is called the
syndrome former memory $m_s$.



LDPC convolutional codes have sparse syndrome former 
matrices. A (J, K) regular LDPC convolutional code is defined 
by a syndrome former that contains exactly J ones in each row 
and K ones in each column. 

We now define the ensemble of LDPC convolutional codes
of interest. Though the ensemble can be defined more generally,
in this paper we focus on the case K = 2J, J > 2.
We construct LDPC convolutional codes defined by syndrome
formers $H^T$ with syndrome former memory $m_s = J - 1$. For
$i = 0, 1, \dots, J-1$, the sub-matrices $H_i^T(t+i)$ of the syndrome
former are the matrices $\left(P_i^{(0)}(t+i),\, P_i^{(1)}(t+i)\right)^T$, where
$P_i^{(h)}(t+i)$, h = 0, 1, is an M × M permutation matrix.
All other entries of the syndrome former are zero matrices.
Equivalently, each $H_i^T(t+i)$, $i = 0, 1, \dots, J-1$, is a $c \times (c-b)$
binary matrix, where c = 2M and b = M. By construction
it follows that each row of the syndrome former has J
ones and each column K ones. Let $\mathcal{C}_P(J, 2J, M)$ denote this
ensemble of (J, 2J) regular LDPC convolutional codes. (Note
that the ensemble of codes $\mathcal{C}_P(J, 2J, M)$ is time-varying.)
Fig. 1 shows the syndrome former matrix of a (3, 6) regular
LDPC convolutional code in $\mathcal{C}_P(3, 6, M)$.
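The construction above can be illustrated with a small sketch that builds a finite section of a syndrome former from random permutation matrices and inspects its row and column weights. The function names, the time indexing from 0, and the restriction to T symbol time instants are assumptions made for illustration only; they are not taken from the paper.

```python
import random


def random_permutation_matrix(M, rng):
    """Return an M x M permutation matrix as a list of rows."""
    perm = list(range(M))
    rng.shuffle(perm)
    return [[1 if perm[r] == c else 0 for c in range(M)] for r in range(M)]


def syndrome_former_section(J, M, T, seed=0):
    """Build the rows of H^T for symbol times 0..T-1 (hypothetical helper).

    Row block t holds the 2M symbols of time t. Column block tau holds
    the M checks of time tau, for tau = 0..T+J-2.
    """
    rng = random.Random(seed)
    n_rows, n_cols = 2 * M * T, M * (T + J - 1)
    Ht = [[0] * n_cols for _ in range(n_rows)]
    for t in range(T):
        for i in range(J):  # block H_i^T(t+i) sits in column block t+i
            for h in range(2):  # two stacked M x M permutation matrices
                P = random_permutation_matrix(M, rng)
                for r in range(M):
                    row = 2 * M * t + h * M + r
                    for c in range(M):
                        Ht[row][M * (t + i) + c] = P[r][c]
    return Ht


def row_weights(Ht):
    """Number of ones in each row (one row per symbol)."""
    return [sum(row) for row in Ht]


def column_weights(Ht):
    """Number of ones in each column (one column per check)."""
    return [sum(col) for col in zip(*Ht)]
```

For J = 3, every row of such a section has weight 3 and every column whose J contributing row blocks are all present has weight 6. The columns at the two ends of the finite section have lower weight, which is the same effect that termination produces in the Tanner graph.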

Since $H_0^T(t)$ consists of two non-overlapping permutation
matrices, it has full rank. Hence $H^T$ defines a rate R = 1/2
code. Further, the constraint imposed by the syndrome former,
i.e.,

$$ v_t H_0^T(t) + v_{t-1} H_1^T(t) + \cdots + v_{t-m_s} H_{m_s}^T(t) = 0, \tag{1} $$

where $v_t \in \mathbb{F}_2^c$, $t \in \mathbb{Z}$, can be used to perform a systematic
encoding of the code [2]. The constraint length of codes in
$\mathcal{C}_P(J, 2J, M)$ is defined as $\nu = (m_s + 1) \cdot c = J \cdot 2M = KM$.
Thus, the constraint length of codes in the ensemble
$\mathcal{C}_P(3, 6, M)$ is 6M.

The Tanner graph for a code in $\mathcal{C}_P(J, 2J, M)$ can be
obtained from its syndrome former matrix. The graph consists
of symbol and check nodes, each symbol node corresponding
to a particular row and each check node corresponding to a
particular column of the syndrome former matrix $H^T$. There
is an edge between a symbol node and a check node if the
corresponding symbol takes part in the respective parity-check
equation. For the Tanner graph of a convolutional code we
can associate a notion of time. At each time instant t the
sub-matrices $H_i^T(t)$ of the syndrome former lead to $c - b$ check
nodes in the Tanner graph. Similarly, for each time instant t,
there are c symbol nodes in the Tanner graph. Observe that
$H_i^T(t)$ is non-zero only for $i = 0, 1, \dots, m_s$, and hence
nodes in the Tanner graph can be connected at most $m_s$ time
units away. The Tanner graph of a code in the ensemble
$\mathcal{C}_P(3, 6, M)$ comprises c = 2M symbol nodes and
$c - b = M$ check nodes for each time instant. Further, each
node can be connected at most $m_s = 2$ time units away.

For practical applications, a convolutional encoder starts
from a known state (usually the all-zero state) and, after
the data to be transmitted has been encoded, the encoder is
terminated back to the all-zero state. It can be shown that for
the ensemble $\mathcal{C}_P(J, 2J, M)$ we need a tail of no more than
$m_s + 1$ time instants, i.e., $(m_s + 1)M$ information bits, to return
the encoder to the all-zero state [7].

Fig. 2. Tanner graph of a terminated convolutional code obtained from
$\mathcal{C}_P(3, 6, M)$.

Suppose that we wish to transmit LM information bits using
a code from $\mathcal{C}_P(J, 2J, M)$. Since the tail occupies
$m_s + 1 = J$ time instants, 2M(L + J) code symbols are transmitted
in total, and the terminated code has rate R = 0.5/(1 + J/L).
Note that for $L \gg J$, the rate loss is negligible. In Fig. 2 we
show the Tanner graph of a terminated code obtained from a
convolutional code in the ensemble $\mathcal{C}_P(3, 6, M)$. Observe that
symbols are zero both before encoding begins, i.e., for t < 1, and
after termination, i.e., for t > L + 3. Hence in obtaining the
Tanner graph of the terminated convolutional code, edges
connecting check nodes to any of the symbol nodes that are known
to be zero can be omitted. For example, we can disconnect the
check nodes at time t = 1 from symbol nodes at time t < 1, since
these are known to be zero. It follows that, while all symbol
nodes in Fig. 2 have degree three, the check nodes can have
degree either two, four, or six. Note that, even though the
convolutional code is regular, knowing bits perfectly before
encoding and after termination leads to a slight irregularity in
the Tanner graph of the terminated convolutional code.
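The rate loss and the resulting check node degrees can be tabulated with a short sketch. It assumes that a check node at time t is connected to two symbols at each of the times t, t − 1, ..., t − (J − 1), that only the symbols at times 1, ..., L + J are unknown, and that check nodes exist for times 1, ..., L + 2J − 1. The function names are hypothetical.

```python
def terminated_rate(J, L):
    """Rate R = 0.5 / (1 + J/L) of the terminated code, as stated in the text."""
    return 0.5 / (1.0 + J / L)


def check_degrees(J, L):
    """Degrees of the check nodes at times t = 1, ..., L + 2J - 1.

    Assumes two edges per check node and symbol time instant, with edges
    to known (zero) symbols outside times 1..L+J removed.
    """
    degrees = []
    for t in range(1, L + 2 * J):
        # Count the symbol time instants t-i that carry unknown symbols.
        active = sum(1 for i in range(J) if 1 <= t - i <= L + J)
        degrees.append(2 * active)
    return degrees
```

For J = 3 this gives degrees two and four at the first two and the last two check node levels and degree six everywhere else, in line with the degrees named above.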

III. Decoding Analysis for Binary-Input 
Memoryless Channels 

As for block codes, an iterative decoder for LDPC convo- 
lutional codes can be conveniently described on the Tanner 
graph. In each decoding iteration messages are exchanged 
between the symbol nodes and the check nodes. We consider 
an algorithm equivalent to the probabilistic iterative decoding 
algorithm proposed by Gallager, which in a wider context is 
known as belief propagation or the sum-product algorithm. 

At a check node extrinsic LLRs are computed by decoding
the associated single parity-check component code. The
message received by a symbol node from its jth neighboring
check node, $j = 1, \dots, J$, can be written as

$$ \beta^{(j)} = 2\,\mathrm{arctanh}\left( \prod_{k' \neq k} \tanh\frac{z^{(k')}}{2} \right), \tag{2} $$

where $z^{(k')}$, $k' \in \{1, \dots, K\} \setminus k$, are the messages that this
check node has received from its other adjacent symbol nodes.
The incoming extrinsic LLRs are then combined with the
intrinsic channel LLR $\alpha$ of the considered symbol to give the
LLRs

$$ z^{(j)} = \alpha + \sum_{j' = 1,\, j' \neq j}^{J} \beta^{(j')}, \quad j = 1, \dots, J, \tag{3} $$

which form the messages to be sent back to the check nodes.
Initially, before the first decoding iteration, the LLRs are set
to $z^{(j)} = \alpha$ for symbols at times $t = 1, \dots, L + J$. For all
other t the code symbols are defined to be zero, which implies
that $z^{(j)} = \infty$ through all iterations. This initial condition
automatically takes into account the lower check node degrees
at the beginning and the end of the Tanner graph.
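A minimal sketch of the two update rules, written for a single node, may help to see how an infinite LLR of a known symbol removes that edge from the check node update. The function names are hypothetical, and the clamping constant is an implementation assumption that keeps `math.atanh` inside its domain; it is not part of the analysis.

```python
import math


def check_node_update(z_others):
    """Extrinsic LLR of a check node, from the LLRs on its other edges.

    Implements beta = 2 * arctanh(prod(tanh(z / 2))).
    """
    prod = 1.0
    for z in z_others:
        prod *= math.tanh(z / 2.0)  # tanh(inf) = 1: known symbols drop out
    limit = 1.0 - 1e-15  # assumption: clamp to avoid atanh(+-1)
    prod = max(-limit, min(limit, prod))
    return 2.0 * math.atanh(prod)


def symbol_node_update(alpha, betas):
    """Outgoing LLRs of a symbol node: channel LLR plus all other betas."""
    total = alpha + sum(betas)
    return [total - b for b in betas]
```

With this convention, a degree-six check node that receives two infinite LLRs from known symbols produces the same output as a degree-four check node, which is how the initial condition accounts for the lower degrees at the ends of the graph.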

We consider the standard parallel updating schedule where,
in each decoding iteration, first all check nodes and then all
symbol nodes are updated according to (2) and (3), respectively.
The messages computed in this way are true LLRs as
long as they are produced from independent observations. The
following theorem guarantees that the number of independent
iterations possible on the Tanner graph of the block code,
produced by terminating convolutional codes from the ensemble
$\mathcal{C}_P(J, 2J, M)$, can be made arbitrarily large.

Theorem 1: For any length L there exists a code in
$\mathcal{C}_P(J, 2J, M)$ for which the number of independent decoding
iterations, $\ell_0$, satisfies

$$ \ell_0 > \frac{\log M}{2 \log\big((2J-1)(J-1)\big)} - c_1 , $$

where the constant $c_1$ does not depend on M. □
The proof of this theorem is based on an analogous theorem 
for LDPC block codes given in [8]. Given that all messages 
are formed from independent observations, it is possible 
to calculate the evolution of their exact probability density 
functions (pdfs) during the iterations [9] (density evolution). 
These pdfs can be used to find an upper bound on the 
smallest channel SNR (convergence threshold) for which the 
error probability converges to zero as the number of iterations 
goes to infinity. Since, in general, density evolution must be
performed numerically, we follow the approach in [8] and
estimate the asymptotic convergence rate by observing, in
addition to the pdfs of the LLRs $z^{(j)}$, their Bhattacharyya
parameter.

For regular LDPC block codes, the distribution of the
messages exchanged in iteration $\ell$ is the same for all nodes
regardless of their position within the graph. Likewise, for
the random irregular code ensembles considered in [9], the
message distributions are averaged over all codes and only a
single mixture density need be considered for all check nodes
and all symbol nodes, respectively. Looking at the flow of
messages in the Tanner graph it can be seen that this is not
true in our case. As shown in Fig. 3(a), messages $z^{(k')}$ in
(2) come from nodes belonging to different time instants. The
same holds for the messages that are combined in (3) (see
Fig. 3(b)). Fig. 4 shows the first level of the corresponding
decoding computation trees for the first three symbol levels
in the case J = 3. Although only the first J − 1 levels of
check nodes have lower degrees, and hence provide better
protection, their effect propagates through the complete graph.
Consequently, while nodes at the same time instant behave
identically, the messages from nodes at different times behave
differently and must all be tracked separately. To take this
into account, for each level different pdfs must be computed
in every iteration for both $\beta^{(j)}$ and $z^{(j)}$, $j = 1, \dots, J$.

Fig. 3. Illustration of the messages (a) to a symbol node and (b) to a check
node for the case J = 3.

Fig. 4. The first level of the computation trees for t = 1, 2, 3 with J = 3.

Consider now the $\ell$th decoding iteration, where $1 \le \ell \le \ell_0$.
Let $\varphi^{(\ell)}_{t,t+k}(z|0)$ and $\varphi^{(\ell)}_{t,t+k}(z|1)$ be the pdfs of messages sent
from the node of a code symbol $v_m$ at time t to one of its
neighboring check nodes at time t + k, conditioned on $v_m = 0$
and $v_m = 1$, respectively. The Bhattacharyya parameter $B^{(\ell)}_{t,t+k}$
of these messages, $k = 0, \dots, J-1$, is equal to

$$ B^{(\ell)}_{t,t+k} = \int_{-\infty}^{\infty} \sqrt{\varphi^{(\ell)}_{t,t+k}(z|0)\,\varphi^{(\ell)}_{t,t+k}(z|1)}\; dz . \tag{4} $$

For the intrinsic channel LLRs $\alpha$, the Bhattacharyya parameter
is derived analogously from the channel transition pdf and is
denoted as A. The following lemma, analogous to Lemma 1
in [8], connects the Bhattacharyya parameters corresponding
to the LLRs of two consecutive decoding iterations.
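For the AWGN channel, the channel Bhattacharyya parameter A can be evaluated directly. The sketch below assumes BPSK signaling with symbols ±1 and noise variance $\sigma^2 = 1/(2 R E_b/N_0)$; under these assumptions the integral of the square root of the two transition pdfs has the closed form $\exp(-R E_b/N_0)$. Since the Bhattacharyya parameter is unchanged by the invertible mapping from channel output to LLR, the integral is taken over the channel output. The function names and the integration grid are illustrative choices.

```python
import math


def bhattacharyya_awgn_closed_form(ebno_db, rate):
    """A = exp(-Es/N0) for BPSK on the AWGN channel, with Es = rate * Eb."""
    esno = rate * 10.0 ** (ebno_db / 10.0)
    return math.exp(-esno)


def bhattacharyya_awgn_numeric(ebno_db, rate, steps=20001, span=12.0):
    """Trapezoidal evaluation of the integral of sqrt(p(y|+1) * p(y|-1))."""
    esno = rate * 10.0 ** (ebno_db / 10.0)
    sigma2 = 1.0 / (2.0 * esno)
    sigma = math.sqrt(sigma2)
    lo, hi = -1.0 - span * sigma, 1.0 + span * sigma
    h = (hi - lo) / (steps - 1)

    def pdf(y, mean):
        return math.exp(-(y - mean) ** 2 / (2.0 * sigma2)) / math.sqrt(
            2.0 * math.pi * sigma2
        )

    total = 0.0
    for n in range(steps):
        y = lo + n * h
        weight = 0.5 if n in (0, steps - 1) else 1.0
        total += weight * math.sqrt(pdf(y, 1.0) * pdf(y, -1.0))
    return total * h
```

The numerical routine is only a cross-check of the closed form; for message pdfs obtained from density evolution, the same integral would be evaluated on the quantized pdfs.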

Lemma 1: The Bhattacharyya parameter $B^{(\ell)}_{t,t+k}$, defined by
(4) for $\ell = 1, \dots, \ell_0$ and $k = 0, \dots, J-1$, satisfies the
following inequality

[...] (5)

where $k', i' \in \{0, \dots, J-1\}$ and [...]. □

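If we define $B^{(\ell)}_{\max}$ as the largest value of $B^{(\ell)}_{t,t+k}$ over all edges
in the graph, i.e.,

$$ B^{(\ell)}_{\max} = \max_{t,k} B^{(\ell)}_{t,t+k} , \tag{6} $$

then it follows from (5) that

$$ B^{(\ell)}_{\max} \le A \left( (2J-1)\, B^{(\ell-1)}_{\max} \right)^{J-1} . \tag{7} $$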

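Suppose now that after some iteration $\ell' < \ell_0$ all Bhattacharyya
parameters $B^{(\ell')}_{t,t+k}$ are smaller than the breakout
value

$$ B_{\mathrm{br}} = A^{-1/(J-2)} (2J-1)^{-(J-1)/(J-2)} . \tag{8} $$

As described in [8], it follows then from (7) that after $\ell_0$
decoding iterations the bit error probability $P_b(t)$ for symbols
at an arbitrary time t satisfies

$$ P_b(t) \le B^{(\ell_0)}_{\max} \le B_{\mathrm{br}} \left( \frac{B^{(\ell')}_{\max}}{B_{\mathrm{br}}} \right)^{(J-1)^{\ell_0 - \ell'}} . $$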

This shows that the bit error probability of all code symbols 
converges to zero at least double exponentially with the 
number of decoding iterations. 
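The role of the breakout value can be checked numerically. The sketch below assumes that the recursion (7) has the form $B^{(\ell)}_{\max} \le A((2J-1)B^{(\ell-1)}_{\max})^{J-1}$, for which the breakout value (8) is exactly the nonzero fixed point; the function names are hypothetical.

```python
def breakout_value(A, J):
    """B_br = A^(-1/(J-2)) * (2J-1)^(-(J-1)/(J-2)); requires J > 2."""
    if J <= 2:
        raise ValueError("the breakout value is only defined for J > 2")
    return A ** (-1.0 / (J - 2)) * (2 * J - 1) ** (-(J - 1.0) / (J - 2))


def bound_step(B_prev, A, J):
    """One step of the assumed recursion B <= A * ((2J-1) * B_prev)^(J-1)."""
    return A * ((2 * J - 1) * B_prev) ** (J - 1)


def bound_trajectory(B_start, A, J, steps):
    """Iterate the upper bound, starting from B_start."""
    values = [B_start]
    for _ in range(steps):
        values.append(bound_step(values[-1], A, J))
    return values
```

Starting below the breakout value, each step raises the ratio to the breakout value to the power J − 1, so the bound shrinks double exponentially; starting exactly at the breakout value, the bound stays constant.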

To determine convergence thresholds for terminated codes
from the ensemble $\mathcal{C}_P(J, 2J, M)$ it is possible to numerically
evaluate, iteration by iteration, the different pdfs $\varphi^{(\ell)}_{t,t+k}(z|\cdot)$
for all time instants t. To prove that the bit error probabilities
of all symbols converge to zero, it is sufficient to check
that $B^{(\ell')}_{\max}$ is below the breakout value $B_{\mathrm{br}}$ after some number
of iterations $\ell'$. The convergence threshold for an ensemble
of codes can be found by testing this condition for different
channel values.

Note that, in addition to the node degrees J and K, the
value L is another parameter that influences the result. For
small L there is a significant rate loss due to the termination,
and results for the erasure channel show that the threshold
can even surpass the capacity of codes with rate R = 1/2
[6], which can be explained by the large fraction of strong
check nodes of low degree. It has also been observed in [6]
that for large L the threshold for terminated convolutional
codes remains constant. This is especially interesting since
with increasing L the degree distribution becomes closer and
closer to that of a regular (J, 2J) block code, which has a
significantly weaker threshold. Hence, the improved threshold
can be attributed to the special structure of the Tanner graph
imposed by the convolutional nature of the codes and not only
to the ratio of stronger to weaker nodes.

While increasing L reduces the rate loss, the computational 
burden of performing density evolution becomes increasingly 
difficult for larger L. Both the number of different pdfs to 
be tracked and the number of iterations until the effect from 
strong nodes at the ends of the graph carries through to the 
levels in the middle increase with L. Also, for the AWGN 
channel, the complexity of density evolution is much higher 
than for the erasure channel, where a simple one-dimensional 
recursion formula can be used. In the next section we consider 
therefore a sliding window updating schedule that reduces the 
number of operations required for the threshold computation. 
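For the erasure channel, the threshold test can be sketched with scalar erasure probabilities in place of pdfs. The recursion below is an illustration built from the graph structure described in Section II, not a transcription of the formula in [6]: it assumes one erasure probability per symbol level and edge offset, two symbols per check node and time instant, and known symbols outside times 1, ..., L + J. The function names, the iteration limit, and the tolerance are arbitrary choices.

```python
def bec_converges(eps, J, L, max_iter=1000, tol=1e-9):
    """Parallel-schedule density evolution on the BEC for a terminated chain.

    p[(t, k)] is the erasure probability of a message sent from a symbol
    at time t to its check at time t + k. Symbols outside 1..L+J are
    known, so their messages have erasure probability zero.
    """
    T = L + J
    p = {(t, k): eps for t in range(1, T + 1) for k in range(J)}
    for _ in range(max_iter):
        # Check node step: q[(tau, k)] goes from the check at time tau
        # to a symbol at time tau - k.
        q = {}
        for tau in range(1, T + J):
            for k in range(J):
                if not 1 <= tau - k <= T:
                    continue
                not_erased = 1.0
                for i in range(J):
                    others = 1 if i == k else 2
                    not_erased *= (1.0 - p.get((tau - i, i), 0.0)) ** others
                q[(tau, k)] = 1.0 - not_erased
        # Symbol node step: erased only if the channel and all other
        # check nodes fail to provide the symbol.
        new_p = {}
        for t in range(1, T + 1):
            for k in range(J):
                value = eps
                for kp in range(J):
                    if kp != k:
                        value *= q[(t + kp, kp)]
                new_p[(t, k)] = value
        p = new_p
        if max(p.values()) < tol:
            return True
    return False


def bec_threshold(J, L, steps=10):
    """Bisection estimate of the largest erasure probability that converges."""
    lo, hi = 0.0, 1.0
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if bec_converges(mid, J, L):
            lo = mid
        else:
            hi = mid
    return lo
```

Because the iteration count is capped, the value returned by `bec_threshold` is an estimate from below. The number of tracked probabilities grows linearly with L, which is the scalar analogue of the growth in the number of pdfs noted above.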



IV. Threshold Computation: A Sliding Window 
Approach 

In the standard parallel updating schedule, considered in 
the previous section, first all check nodes and then all symbol 
nodes are activated in each iteration. This is convenient for an 
analysis of decoding since then the computation trees of all 
symbols have a very regular structure. An alternative schedule, 
where a symbol node is activated whenever a message is de- 
manded by a neighboring check node, was considered in [10]. 
The check nodes are activated one by one. It has been observed 
in computer simulations that such an on-demand symbol node 
update can reduce the required number of decoding iterations. 

In general, any arbitrary node activation order in the de- 
coding will result in a particular shape of the computation 
trees of the different code symbols. Since, for any finite 
number of node activations, the depths of the computation 
trees will be finite as well, it follows that these trees can be 
covered by trees corresponding to a parallel updating schedule 
with a sufficient number of iterations. Consequently, any node 
updating schedule can be interpreted as a parallel schedule 
where certain node activations are omitted. But, under the 
independence assumption, such an omission of additional side 
information can never improve the performance of decoding, 
which results in the following proposition. (A similar result 
has also been obtained in [11].) 

Proposition 1: Consider density evolution with an arbitrary
node activation order. Assume that the breakout value condition,
described in the previous section, is satisfied after a
specific number of node activations. Then this condition will
also be satisfied for a standard parallel updating schedule after
a sufficiently large number of iterations. □

Let us now consider the following updating schedule in
density evolution. In a window from level t = t' to level
$t = \min(t' + W - 1, L/2)$, W < L/2, all symbol nodes are
activated one by one according to (3). Check nodes are activated
according to (2) whenever a neighboring symbol node
demands a message. The starting position of this updating
window of size W is initialized by t' = 1. The nodes within
the window are updated repeatedly until the error probability
at level t' reaches some value that corresponds to a sufficiently
small Bhattacharyya parameter $B_0$, $0 < B_0 < B_{\mathrm{br}}$. Then the
window is shifted by increasing t' by one. (Note that we
only have to consider levels $t = 1, \dots, L/2$ because of the
symmetry within the Tanner graph.)
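The window schedule can be sketched for the erasure channel, where the Bhattacharyya parameter of a message equals its erasure probability. The sketch makes several simplifying assumptions that are not taken from the paper: the chain is treated as extending beyond the window to the right, levels outside the window keep their current values, the symmetry with respect to the middle of the graph is ignored, and one "sweep" updates every symbol level in the window once. All names and default values are hypothetical.

```python
def window_density_evolution(eps, J, levels, W, B0=1e-6, max_sweeps=5000):
    """Sliding-window density evolution on the BEC (illustrative sketch).

    Returns the number of window sweeps spent at each window start
    t' = 1..levels, or None if some level never falls below B0.
    """
    horizon = levels + W + J  # all symbol levels a window can touch
    p = {(t, k): eps for t in range(1, horizon + 1) for k in range(J)}

    def check_to_symbol(tau, k):
        # Computed on demand from the current symbol messages; symbols
        # at times t < 1 are known and contribute erasure probability 0.
        not_erased = 1.0
        for i in range(J):
            others = 1 if i == k else 2
            not_erased *= (1.0 - p.get((tau - i, i), 0.0)) ** others
        return 1.0 - not_erased

    sweeps_per_level = []
    for start in range(1, levels + 1):
        end = start + W - 1
        sweeps = 0
        while max(p[(start, k)] for k in range(J)) >= B0:
            if sweeps == max_sweeps:
                return None
            for t in range(start, end + 1):
                for k in range(J):
                    value = eps
                    for kp in range(J):
                        if kp != k:
                            value *= check_to_symbol(t + kp, kp)
                    p[(t, k)] = value  # serial, in-place update
            sweeps += 1
        sweeps_per_level.append(sweeps)
    return sweeps_per_level
```

The returned list plays the role of an "updates per level" curve in this simplified setting. A return value of `None` indicates that the first unconverged level could not be brought below `B0` within the sweep limit.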

For J = 3, Fig. 5 shows the required number of updates per
level (computational complexity) and the bit error probability
at different levels as a function of the updates. In this example
the signal-to-noise ratio $E_b/N_0 = 0.55$ dB of the AWGN
channel corresponds to the estimated threshold, which explains
the high number of required updates. Due to the window
approach the number of updates per level increases until it
reaches its maximum at t = 20. After that it remains at a
constant level until the window reaches the middle of the
Tanner graph, where (symmetry) effects from the other end of
the graph result in a reduction of updates. It has been observed
that after the window size W exceeds a certain minimal value,
the bit error curves no longer change if W is further increased.
This helps us to choose the parameter W in our calculations.
But more importantly, we can conclude that this approach is
as good as a non-windowed updating schedule.

Fig. 5. Updates per level and bit error probability behavior for J = 3,
L = 100, and W = 20 at $E_b/N_0 = 0.55$ dB.

It can also be seen in Fig. 5 that the bit error curves for
levels t = 25 and t = 30 are almost indistinguishable. This
is actually the case in the complete region of levels where 
the number of required updates stays constant. This indicates 
that the performed calculations tend to repeat themselves at 
different window positions. From this we may conclude that 
the effect of the strong nodes at the ends of the Tanner 
graph carries through to the middle independently of the 
termination length L. This confirms the observation on the 
erasure channel in [6] that the threshold remains constant for 
large L and that the number of iterations required for reaching 
a certain bit error probability at a node in the middle of 
the graph increases linearly with L. Furthermore, it suggests 
that detecting convergence at the first levels is sufficient to 
determine the overall convergence threshold. 

The same observations can be made for the erasure channel, 
for which we can state the following result. 

Proposition 2: Consider density evolution on the erasure
channel with the window updating schedule described above
for an arbitrary termination length L. Starting from t' = 1,
assume that t' is increased as soon as the Bhattacharyya
parameter at level t' is below some value $B_0 < B_{\mathrm{br}}$. Under
these conditions, if the window can be shifted at least J times,
then $B_0$ can be reached at all t, $1 \le t \le L$. □

This proposition can be proved by induction when the
updating window is initialized by the same pdfs at different
window positions t'. Here we make use of the fact that the pdfs
$\varphi^{(\ell)}_{t,t+k}(z|\cdot)$ computed within density evolution can be ordered
in terms of quality. For the erasure channel this follows from
the fact that the pdfs are described by a single parameter, the
probability of erasure. For other channels such an ordering of
the pdfs $\varphi^{(\ell)}_{t,t+k}(z|\cdot)$ is not obvious. However, we conjecture
that the statement of Proposition 2 is true for arbitrary
binary-input memoryless channels.

TABLE I

Thresholds for the ensembles $\mathcal{C}_P(J, 2J, M)$ with different J.

In Table I the estimated thresholds $(E_b/N_0)^*$ for the AWGN
channel are presented for different J. In the computations the
values of L were chosen such that there is a rate loss of 2%,
i.e., R = 0.49. Assuming that the thresholds are independent
of the termination length L, the right hand side of the table
shows the corresponding threshold values $(E_b/N_0)^{**}$ for the
case $L \to \infty$. Similar to the erasure channel results, the
thresholds are much better than those of the corresponding
regular LDPC block codes (e.g., 1.11 dB for J = 3), and tend
to the capacity limit of rate R = 1/2 codes with increasing
J.
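The two quantities in this paragraph can be related by a short worked calculation. It rests on an assumption that is not spelled out in the text: that the threshold is independent of L when expressed through the channel noise level (equivalently $E_s/N_0$), so that only the rate in $E_b = E_s/R$ changes between the two columns.

```latex
% Termination length giving a 2% rate loss, from R = 0.5 / (1 + J/L):
\[
  \frac{0.5}{1 + J/L} = 0.49
  \quad\Longrightarrow\quad
  \frac{J}{L} = \frac{1}{49}
  \quad\Longrightarrow\quad
  L = 49\,J .
\]
% Conversion of the threshold at fixed E_s/N_0 from R = 0.49 to R = 0.5:
\[
  \left(\frac{E_b}{N_0}\right)^{**}_{\mathrm{dB}}
  = \left(\frac{E_b}{N_0}\right)^{*}_{\mathrm{dB}}
    - 10 \log_{10}\frac{0.5}{0.49}
  \approx \left(\frac{E_b}{N_0}\right)^{*}_{\mathrm{dB}} - 0.088\ \mathrm{dB} .
\]
```

Under this assumption the limiting values differ from the values at R = 0.49 by slightly less than 0.1 dB, for every J.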